11,668 results • Page 2 of 234
Hi, I would like to rename the sequences in my file to have leading zeros in front of the header name using unix or perl based on the number of sequences
updated 8.2 years ago • rrsowmya
o alaikum everyone, I am working with multiple genes, and in each gene folder I have multiple FASTA files (70-75), each containing a single gene sequence, e.g. AMY2b_Gene_folder Chimpanzee_AMY2B_CDS.fasta...> AGCTCCCAAGGGATTTGGAGGGGTTCAGGTCTCTCCACCAAATGAAAATGTTGCAATTCACAACCCTTTC I want to change the headers of each FASTA file according to a specific order given in a text file. …
updated 6.7 years ago • adeena_hassan
Hello people, I hope you are well. I was wondering if you can help me, I need to batch rename a large amount of RefSeq genome files ".fna" format. Below I show you an example of the file headers: GCA_000007685.1_ASM768v1_genomic.fna...TTGTTTTTACTACTTGGATATATGAAAAAATACTTTGGAACTTGTTTCAAAAGTTAGAATGTGGGGTCTTCTTCAAAAAA My idea is to rename them to look like this: Li_serovar_Copenhageni_Fi…
updated 14 months ago • BATMAN
I am trying to open the fasta file with sequences. My bio perl script opens the sequence but not with fasta header. How can I get fasta header with bio...perl -w use Bio::SeqIO; $seqio_obj = Bio::SeqIO->new(-file=> "no_plasmid.fasta", -format=>"fasta"); $seq_obj = $seqio_obj->next_seq; $acc = $so->accession_number; while($seq_obj = $seqio_obj->next_s…
updated 7.8 years ago • bandanaschapagain
Hi, I have a fasta file with 300 protein sequences. I intend to construct a phylogenetic tree with it. I want only the accession number...and the organism name in the fasta header and remove the rest of the information. Can anybody suggest how to do this? I have a Linux-based system with perl...and python installed. For example, I want to convert a header like this: >gi|685204…
updated 8.0 years ago • bkvijay.jayaraman
I have a fasta file with headers that I want to compare with two other text files and if it's present in the first file, put it first on the...to compare the text files to the fasta header and if there's a match, organize/reorder the fasta header file so that that match name is the first entry on the header...look like this. (for the first one since fly is present in file2, place it as the first …
updated 18 months ago • jnora0625
Hi, I have 10 fasta files (each file with 20 gene sequences from each of the 10 samples). I would like to create 20 files, specific to each gene...from 10 samples. I proceeded as follows to extract genes with the file_name in header: pyfasta extract --header --fasta test.fasta gene_name1 | awk '/^>/ {$0=$0 "_sample1"}1' > gene_name1.fasta Output: >gene_na…
updated 6.7 years ago • bioinfo8
I have fasta file namely `119XCA.fasta` as shown below, >cellulase ATGCTA >gyrase TGATGCT >16s TAGTATG I need to remove all the...fasta headers, keep the sequences one by one and need to write file name as a fasta header. The expected outcome is shown below...TAGTATG I have used the following script `sed '/^>/d' foo.fa > out.fa` which re…
I have strange fasta headers like this for some good number of sequences, >gi|61221638|sp|P0A366.1| >gi|61221640|sp|P0A368.1|CR1AA_BACTE...I would like to replace the other (`>gi`) in the fasta header to blank or `;`. Can anyone suggest how to do it. I have many such sequences in a big fasta file
updated 4.7 years ago • empyrean999
Hi, please let me know how I can edit the FASTA file header. >LR99555.1 Avo, chromosome: 1 I want the header to look like this: >LR99555.1
updated 2.1 years ago • p
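A minimal Python sketch for the question above, assuming the goal is to keep only the first whitespace-delimited token of every header (as in `>LR99555.1 Avo, chromosome: 1` becoming `>LR99555.1`). The function names `strip_description` and `simplify_headers` are made up for illustration and are not part of any library.

```python
def strip_description(header: str) -> str:
    """Return a FASTA header reduced to '>' plus its first token."""
    if not header.startswith(">"):
        return header  # not a header line, leave it untouched
    tokens = header[1:].split()
    # An empty header (">" only) has no token to keep, so return it as-is.
    return ">" + tokens[0] if tokens else header


def simplify_headers(lines):
    """Yield lines with header descriptions removed; sequence lines pass through."""
    for line in lines:
        yield strip_description(line.rstrip("\n"))


if __name__ == "__main__":
    print(strip_description(">LR99555.1 Avo, chromosome: 1"))
```

To process a whole file, something like `simplify_headers(open("input.fasta"))` could be iterated and each result written to a new file; the file name here is a placeholder.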
if this has been asked before, but I have a genome assembly file that I just converted from .bam to fasta format in order to start annotation. I would like to run CEGMA on this assembly, because I have concerns about the quality...but the problem is that the default header format when the fasta was created is not acceptable. This is because in the current format there are 5237924 sequences...with …
updated 23 months ago • zgayk
I have a large fasta file. As you can see below, a > sign is present inside some fasta headers, like >exon2_ENST00000218032|>exon2_ENST00000218032...>exon17_ENST00000253024|>exon17_ENST00000253024 I want to remove the extra > sign from the header, so that after removal the header looks like this >exon2_ENST00000218032|exon2_ENST00000218032 …
updated 3.1 years ago • harry
I have downloaded a reference UniProtKB FASTA file. How can I extract only the FASTA headers of each gene (row-wise) into a CSV file using R
updated 14 months ago • WUSCHEL
I have a FASTA file like this: >seqA AAAAAAAAAA >seqB AAAAAAAAAA >seqC TTTTTTTTTT >seqD CCCCCCCCCC >seqE CCCCCCCCCC >seqF...AAAAAAAAAA I've recently started learning SeqKit, and I've found that rename can append _N to the header based on the occurrence of the sequence, and also that rmdup can remove…
updated 7 weeks ago • Broccoli
converting the file from fastq to fasta SeqIO.convert(seq_file,"fastq",labels[0]+".fasta","fasta") no problem; but now I would like to change the header of the fasta file...m stuck. When I add the `SeqIO.parse` function like this for seq_record in SeqIO.parse(labels[0]+".fasta","fasta"): seq_record.id = labels[0] # renaming the pseudogene with the lab id SeqIO.write(seq_…
updated 3.1 years ago • skbrimer
Hi community, I am not an expert with sed but I want to edit the headers of each sequence in a fasta file. I want to keep only the gene id **>NODE_39_length_59461_cov_85.505003_1** The header
updated 2.3 years ago • Candela
Hi, I would like to modify the fasta headers from a file. I would like to change: >A0A0F2M4U6|A0A0F2M4U6_SPOSC Endoplasmic reticulum chaperone BiP OS
updated 2.5 years ago • marcus.teixeira
Hello, I have a lot of sequences in a FASTA file, and I want to extract a specific sequence knowing the header ID. For example, the header of a sequence is: NODE_19_length_5758_cluster_19_candidate_1...I know that with `grep` I can extract the header, but I want the sequence lines below it to appear on stdout as well. How can I do this in bash
updated 3.4 years ago • v.berriosfarias
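The post above asks for bash; as an alternative, here is a small Python sketch of the same idea, assuming the record ID is the first whitespace-delimited token after `>` and that an exact match is wanted. The name `extract_record` is hypothetical.

```python
def extract_record(lines, wanted_id):
    """Yield the header and all sequence lines of the record whose ID is wanted_id."""
    keep = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: decide whether to keep it.
            tokens = line[1:].split()
            keep = bool(tokens) and tokens[0] == wanted_id
        if keep:
            yield line


if __name__ == "__main__":
    import sys

    # Usage (file name and ID are placeholders): python extract.py input.fasta SOME_ID
    with open(sys.argv[1]) as handle:
        for out_line in extract_record(handle, sys.argv[2]):
            print(out_line)
```

Because the match is on the whole ID rather than a substring, an ID such as `NODE_19` would not accidentally pull in `NODE_190`.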
In this example FASTA file, you can see that some headers are repeated many times, for example: exon19_ENST00000194900|exon21_ENST00000194900...exon18_ENST00000194900|exon21_ENST00000194900 So I want to remove all FASTA sequences that have the same header and keep only one of them. I want to remove sequences on...the basis of the header, not the sequence. Thanks in…
updated 3.2 years ago • harry
Hey guys, I have a multi-fasta protein file like this >SF_hydrolase MKG... >LH_reductase MKI... >SM_hydrolase MSN... Basically, I would like to extract...only the fasta headers that have the other "reductase". I know how to extract headers that have the same headers as the ones present on...a list, but I don't know how to extract fasta-headers solely based on o…
updated 5.2 years ago • genomes_and_MGEs
Hello everyone, can anyone guide me in editing FASTA headers? My FASTA headers are shown below >NP_006556.1 transcriptional repressor CTCF isoform 1 [Homo sapiens
updated 3.6 years ago • bioinformatics.queries
So I have a directory full of fasta files and I want to replace the fasta header in each one with the name of its corresponding fasta file. For example: HC1993.fa...> X58834 CCTGCATCTGCAA HC1993.fa > HC1993 CCTGCATCTGCAA I have about 50 fasta files like that in a directory that I want to do the same thing to. I've been using this sed command for one file that works...sed '…
updated 3.9 years ago • tpaisie
Hey all, I'm working with a lot of data from NCBI and at the moment I'm kind of stuck. I have a ton of fasta files, either containing genomic contigs or the 16S sequences I extracted from those genomes using RNAmmer. The files were automatically downloaded from NCBI and are named like this: GCF_000284355.1_ASM28435v1_genomic.fasta (for contigs) GCF_000284355.1_ASM28435v1_genomic_16S.fasta (for …
updated 5.1 years ago • Guillaume.Tahon
Hello everybody! I have a fasta file I'm looking to work with in qiime. Unfortunately, it doesn't currently meet their formatting requirements. I need...to change headers like this: >3180275|DCO_MAC_Bv6--LI09_3|40099 XXXXXXXXXXXXXXXXXXXXXX >13488354|DCO_MAC_Bv6--LD09_2_3|2 XXXXXXXXXXXXXXXXXXXXXX...>333430241|DCO_MAC_Bv6--LO13_8|1 XXXXXXXXXXXXXXXXXXXXXX To…
updated 12 months ago • Dani
courtesy) 201200175|A|name1|175|2012 201200287|A|name2|287|2012 201200845|A|name3|845|2012 my fasta file looks like.. >201200175 >201200287 >201200845 I want the output like... >201200175|A|name1|175|2012 >201200287
updated 4.5 years ago • Shaminur
I have 5000 FASTA sequences with Uniprot ids. Now, I want to add a unique identifier at the beginning of each FASTA header. An example will...And so on I want to add ABC0001 to ABC5000 at the beginning of the fasta header. And the corresponding gene name from my txt file. gopA ABC0001 A12345 gopD ABC0002 B57384 ........................ fotR ABC5000 C12345...And so on As I understand, I …
updated 10.5 years ago • bioinfo
Hi all friends, I have a large fasta file that most sequences have a identical header (they differ from the length). I usually extracted the sequences of interest...requires the Biopython library" sys.exit(0) try: fasta_file = sys.argv[1] # Input fasta file wanted_file = sys.argv[2] # Input wanted file, one gene name per line result_file = sys.argv[3] # Output fasta file …
updated 7.2 years ago • seta
Hello; I need to process FASTA headers by matching the FASTA description (not the FASTA ID) against the first column of another two-column file and printing the second...column of that file onto the FASTA header. Here are examples and what I have so far. file1.txt (list file) group_1 gene 1 group_2 gene 2 group_3 gene 3 group_4...my $input; close $infile; { local $/ = undef; …
updated 7.8 years ago • empyrean999
Hello everyone, I am trying to replace the headers A of a FASTA file (file.fasta) with headers B. For this, I have a list that matches the header names. >A_1 >B_1 >A_2 >...B_2 >A_3 >B_3 etc... I am using this loop to replace the headers: cat list | while read f ; do echo $f > temp_file A=$(awk '{print $1}…
updated 3.6 years ago • Begonia_pavonina
Hi everyone! I need help with something. I am very new to bioinformatics. I have a fasta file with 32K reference sequences for an X gene. The headers are the accession numbers, but I need to change them for the...So I think I already did the hardest part) but now I need to combine this information and change the headers of my fasta to the GI of each sequence. I've tried with this script: ``` …
updated 21 months ago • marcelavillegasp
I have a fasta file with the following format: >BNY.1.2.t17987.mrna1 CDS=1-1065 seq... How can I remove everything after ".mrna1" from...the headers
updated 4.3 years ago • 2822462298
As above, I have a FASTA file with long header names and I want to rename them to include just the first and last names, like: >exon9_ENST00000462434:exon25_ENST00000462434
updated 3.2 years ago • harry
Hello, I have a list of headers and I need to extract the corresponding sequences from a fasta file. How can I do it? Kindly let me know. The header file looks like this >...>TRINITY_DN74659_c0_g1_i1 >TRINITY_DN74659_c0_g1_i1 >TRINITY_DN74698_c0_g1_i1 The fasta file looks like this >TRINITY_DN74697_c0_g1_i1 len=243 path=[221:0-242] [-1, 221, -2] GTATGTCCCACCAGACAC…
updated 23 months ago • Princy
of interest. I am almost done with the script. But I would also like to include gene names in the fasta headers. By default, it only includes coordinates in fasta headers. Below is my script: >coords=Chr1 1000 2000 forward...>TTTGGGGTTATAAATTATTAGAAGTT...... I was wondering if there is a way to include the gene name in the fasta header. Thanks, R
updated 7.2 years ago • RT
I am a newbie to Linux... I would like to modify the headers of a fasta file. **My header is like: >100123_00010T gene=100123_00010** **And, I would like to have headers like "100123_00010
updated 13 months ago • hellokwmin
Hi, I have a fasta file, which has some same headers like below. They have different sequence but same header. How can I merge them or what...should I do? I want to run orthoMCL but it requires unique headers. ``` >c12358_g1_i9 >c12358_g1_i9
updated 21 months ago • Mehmet
Hello, I am trying to convert my vcf files to fasta. However, after aligning to reference, vcf ID from the header disappears, and bcftools/vcftools are writing only reference...seq name in file header. Like > NC_xxxx.1 Any ideas? I run consensus script like for file in $inpath/*.vcf ; do echo $file bname=$(basename $file) echo...base name is …
updated 3.6 years ago • storm1907
Does anyone have a handy method for making a fasta header comply with the UniProt header specifications? http://www.uniprot.org/help/fasta-headers In particular, I would
updated 7.2 years ago • nickp60
Hi, I have a protein fasta file whose headers look like '>evm.model.chr.9.52'. There are almost 30k+ proteins. I have performed functional annotations...Now, I am performing some analysis and I want to add at least the protein name or even the GO term to the fasta header so it would make things a lot easier for me. I want something like: >evm.model.chr.9.52 GO:1234678 Can I do it with
updated 15 months ago • ahmadjoyyia
Hello All, I have a multi fasta file with millions of sequences. I want to duplicate a part of the header and join it to the header itself with a pipe, while...another part (of the header) should be deleted. Let's say I have a fasta file, "input.fasta," which looks like this: >Gene1 wbdfwbf ATGCCGATGCAGTGACG...f 1 < input.fasta > out1.fasta` for deleting spa…
updated 2.1 years ago • bionix
reference genome sequence using the BWA software, and it gave me a .sam file. I used samtools SAM to FASTA to convert the aligned reads to fasta file. I want to look at assembly statistics and also evaluate completeness with...BUSCO. I received the following error: **The character "/" is present in the fasta header >A00600:204:HFMJ3DSX3:3:1101:3640:1125/1, which will crash Reader. Please…
updated 18 months ago • hpalk42
Hello, I have a text file with thousands of unique sequences in fasta format. Each read has a header in the following format: 122391_Tcount2352_Acount2352_Bcount0_length293 It's obvious...was used at some point in the pipeline. I'm curious to see if anyone here has encountered this header format before and can tell me which part of the sequence header represents the count of reads. Thanks …
updated 5.3 years ago • genya35
I have a fasta file with hundreds of sequences and their respective headers. The headers (all of them) are in the format >ABCD [id_123] (gene_XYZ) [protein_ijk] [protein_id=qqq] [123..899] .......sequence............ >…
updated 7.3 years ago • leo1985.arnab
Hello I have a fasta file with sequence headers written as ``` >0|quiver|1..2075|- >0|quiver|2210..3058|- >0|quiver|3112..4169|- ``` and so on till around
updated 21 months ago • utkarsh.sood
I'm wondering about the most straightforward way to extract the interval information contained in a fasta header such as the one below, thanks! Also maybe to pipe it into a newly created bed file. >Mouse|chr12:112380949-112381824
updated 6.3 years ago • rbronste
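One possible way to pull the interval out of a header shaped like `>Mouse|chr12:112380949-112381824` is a regular expression. The sketch below is only an illustration: `header_to_bed` and its `one_based` flag are hypothetical names, and whether the start coordinate needs shifting for BED (which is 0-based, half-open) depends on how the headers were generated, which the post does not say.

```python
import re

# name | chrom : start - end, anchored at the beginning of the header
INTERVAL = re.compile(
    r"^>(?P<name>[^|]*)\|(?P<chrom>[^:|]+):(?P<start>\d+)-(?P<end>\d+)"
)


def header_to_bed(header, one_based=False):
    """Return a tab-separated BED line (chrom, start, end, name) or None."""
    match = INTERVAL.match(header)
    if match is None:
        return None  # header does not follow the name|chrom:start-end pattern
    start = int(match["start"])
    if one_based:
        # Assumption: input coordinates are 1-based inclusive, so shift the start.
        start -= 1
    return f"{match['chrom']}\t{start}\t{match['end']}\t{match['name']}"


if __name__ == "__main__":
    print(header_to_bed(">Mouse|chr12:112380949-112381824"))
```

Looping over the header lines of the FASTA file and writing every non-`None` result to a `.bed` file would give one interval per sequence.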
I am sure that someone will do this work faster and better than me. I would like to edit multiple fasta headers from this format. >M01380:50:000000000-AV1DH:1:1101:16094:3001 1:N:0:M636:16S_V1V3 TTCTGCCT|0|TAGACCTA|0 CS1_534R_YM3_for...3|27| to this one: >M636 As you can see, "M636" is embedded in the longer header. Thank you for always helping everybody! D
updated 6.9 years ago • DVR
I want to extract **gene name** , **gene start position** and **gene stop position** from the fasta header of the fasta file. I have tried to extract based on the position but those locations are not consistent. Is there...and 17th element from this list. It works for this particular example. This does not work for other headers where these positions are different. Usually, gene name is consisten…
updated 3.9 years ago • lokraj2003
Hello! I have a FASTA file and I need a script that reads in the file, changes all the headers to a new format and writes out all the sequences in a new output file. The modified headers should contain, for each sequence, the species name (with "_" r…
updated 6.4 years ago • mpbiology.dna
How can I extract a specific column from the sequence header identifiers of a fasta file? My headers look like this: ``` >PGM0100236.1 [Candida] scaffold00238 >PGM0100236.1 [Candida...scaffold00241 ``` I would like to take only the third column, i.e. scaffold00238, for all the headers in my fasta file. Please give a simple command solution. I am new to bioinfo and Linux scripting. Thank you
updated 20 months ago • palani
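For headers like `>PGM0100236.1 [Candida] scaffold00238`, a short Python sketch that splits on whitespace and picks one column might look like the following. `header_field` is a made-up helper name, and the approach assumes that no field contains spaces itself (a bracketed name with a space in it would shift the column count).

```python
def header_field(header, index=2):
    """Return the whitespace-delimited field at `index` (0-based), or None if absent."""
    fields = header.lstrip(">").split()
    return fields[index] if index < len(fields) else None


def fields_from_fasta(lines, index=2):
    """Yield the chosen field for every header line in an iterable of FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            yield header_field(line.rstrip("\n"), index)


if __name__ == "__main__":
    print(header_field(">PGM0100236.1 [Candida] scaffold00238"))
```

The default `index=2` corresponds to the third column asked about in the post; passing `index=0` would return the accession instead.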
Dear all, I want to add the special characters "/1" to each fasta header (at the end of the header) in an 8.5 GB fasta file. I used the following command: perl -p -e 's/^(>.*)$/$1-New_Header_info/g' input.fasta
updated 9.3 years ago • vahapel